Discriminative Word Alignment via Alignment Matrix Modeling
Abstract
This paper presents a new discriminative word alignment method. The approach models the alignment matrix directly with a conditional random field (CRF), so no restrictions have to be placed on the alignments. Features are easy to add, so all available information can be used. Because the structure of the CRF can become complex, inference can only be done approximately, and the standard algorithms had to be adapted. In addition, different methods for training the model were developed. With this approach, alignment quality improved by up to 23 percent on three language pairs compared to a combination of both IBM4 alignments. The word alignment was also used to generate new phrase tables, which improved translation quality significantly.
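To make the idea of modeling the alignment matrix concrete, here is a minimal Python sketch. It is not the paper's model: it scores every cell of the matrix independently with a small log-linear function, whereas the CRF described in the abstract also couples cells with each other (which is why its inference is approximate). The feature names, the weights, and the helper functions `link_score` and `predict_alignment_matrix` are all hypothetical and chosen only for illustration.

```python
def link_score(src_word, tgt_word, j, i, src_len, tgt_len, weights):
    """Weighted feature sum for linking source word j to target word i.

    The three features are illustrative assumptions, not the paper's feature set.
    """
    features = {
        # 1.0 if the two words are string-identical (e.g. names, numbers)
        "identical": 1.0 if src_word.lower() == tgt_word.lower() else 0.0,
        # penalty that grows as the cell moves away from the diagonal
        "rel_pos_diff": -abs(j / src_len - i / tgt_len),
        # constant bias term
        "bias": 1.0,
    }
    return sum(weights[name] * value for name, value in features.items())


def predict_alignment_matrix(src, tgt, weights):
    """Return a len(src) x len(tgt) 0/1 matrix; a cell is 1 if its score > 0.

    Each cell is decided on its own, so a word may receive zero, one,
    or several links: no one-to-one or one-to-many restriction is imposed.
    """
    return [
        [
            1 if link_score(s, t, j, i, len(src), len(tgt), weights) > 0 else 0
            for i, t in enumerate(tgt)
        ]
        for j, s in enumerate(src)
    ]


# Hypothetical weights, set by hand rather than trained.
WEIGHTS = {"identical": 2.0, "rel_pos_diff": 4.0, "bias": 0.5}

matrix = predict_alignment_matrix(
    ["Berlin", "is", "big"], ["Berlin", "ist", "groß"], WEIGHTS
)
for row in matrix:
    print(row)
```

Because every cell is a separate binary decision, the same source word can be linked to several target words, which is the sense in which a matrix model needs no structural restrictions on the alignment.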
Similar resources
Discriminative Modeling of Extraction Sets for Machine Translation
We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principal advantages over word-factored alignment models. First, we can incorporate featur...
Search for Discriminative Word Alignment via Dual Decomposition
Shiqi Shen, Yang Liu and Maosong Sun (Department of Computer Science and Technology, State Key Lab on Intelligent Technology and Systems, Tsinghua University, Beijing 100084, China). Abstract: Word alignment aims to calculate the correspondence between the words in parallel texts. It has an important influence on machine translation, bilingual dictionary construction and many other natu...
Unsupervised Word Alignment with Arbitrary Features
We introduce a discriminatively trained, globally normalized, log-linear variant of the lexical translation models proposed by Brown et al. (1993). In our model, arbitrary, nonindependent features may be freely incorporated, thereby overcoming the inherent limitation of generative models, which require that features be sensitive to the conditional independencies of the generative process. Howev...
Discriminative Word Alignment with Syntactic Features
This report introduces a study on syntactic features used in a discriminative word alignment model. The features are implemented on a state-of-the-art discriminative word alignment system. The syntactic features are extracted from parse trees. Three types of syntactic features are tested in this work: one global tree path feature and two first order tree features. Experimental results sho...
Incorporating Constituent Structure Constraint into Discriminative Word Alignment
We introduce an approach to incorporate the constituent structure constraint into a discriminative word alignment model by presenting the constituent constraint in an explicit way and using three operations to ensure the constraint when searching for the best word alignment. In this way, we will be able to make use of the weak order constraint induced by the inversion transduction grammars (ITG), as w...
Publication date: 2008